Equivalence of laboratory-developed test and PD-L1 IHC 22C3 pharmDx across all combined positive score indications

We conducted an analysis across multiple PD-L1 combined positive score (CPS) indications to establish concordance of a 22C3 antibody–based laboratory-developed test (LDT) on the Ventana BenchMark XT or BenchMark ULTRA platform and the regulatory-approved PD-L1 IHC 22C3 pharmDx in cervical cancer (CC), esophageal squamous cell carcinoma (ESCC), head and neck squamous cell carcinoma (HNSCC), triple-negative breast cancer (TNBC), and urothelial carcinoma (UC). Tumor specimens from each tumor type were stained with 22C3 antibody and scored using the 22C3 antibody–based LDT, and scores were compared with those using PD-L1 IHC 22C3 pharmDx. PD-L1 status was measured by the pathologist using CPS as a continuous score and using clinically relevant cutoffs (CC, ≥1 and ≥10; HNSCC, ≥1 and ≥20; ESCC, TNBC, and UC, ≥10). The agreement between the BenchMark platforms and PD-L1 IHC 22C3 pharmDx was assessed by intraclass correlation coefficient (ICC) and a contingency table for clinical interpretation. A total of 522 samples were evaluated for the pan-tumor analysis (CC, n = 77; ESCC, n = 80; HNSCC, n = 126; TNBC, n = 118; UC, n = 121). Most clinical interpretations of PD-L1 status were concordant between the BenchMark XT and PD-L1 IHC 22C3 pharmDx for all five tumor types with regard to negative percentage agreement (NPA; 83–97%), positive percentage agreement (PPA; 86–100%), and overall percentage agreement (OPA; 90–97%); the ICC by tumor type was high (≥0.88). Importantly, the pan-tumor ICC was 0.95 (95% CI 0.94–0.96). Thirty additional TNBC samples were evaluated using the BenchMark ULTRA and PD-L1 IHC 22C3 pharmDx; the NPA, PPA, and OPA were 100%. The 22C3 antibody–based LDT on Ventana BenchMark XT and BenchMark ULTRA platforms demonstrated high concordance with the regulatory-approved PD-L1 IHC 22C3 pharmDx across multiple tumor types. These findings suggest the comparability of PD-L1 IHC 22C3 pharmDx with an LDT based on the 22C3 antibody.

Introduction
Pembrolizumab is a highly selective humanized monoclonal antibody that blocks the interaction between programmed death 1 (PD-1) and its ligands, PD-L1 and PD-L2, which helps restore T-cell responses against tumor cells [1][2][3][4]. The antitumor activity and safety of pembrolizumab have been established across a spectrum of solid and hematologic malignancies, which has led to its approval by the US Food and Drug Administration (FDA) and European Medicines Agency for these types of cancer [3,4]. PD-L1 expression is predictive of response to PD-1/PD-L1 inhibitors in several tumor types [5][6][7]. Accordingly, several indications for which pembrolizumab is approved by regulatory agencies are specifically for patients whose tumors express PD-L1 above certain thresholds as determined by an FDA-approved and/or CE-marked companion diagnostic, PD-L1 IHC 22C3 pharmDx (Agilent, Carpinteria, CA) [3,4,8]. For example, pembrolizumab is approved for the treatment of patients with recurrent or metastatic cervical cancer with disease progression on or after chemotherapy whose tumors express PD-L1 combined positive score (CPS) ≥1, as determined by PD-L1 IHC 22C3 pharmDx [3,8]. PD-L1 IHC 22C3 pharmDx is also clinically validated and approved by the FDA in selecting patients for pembrolizumab monotherapy or combination therapies who have non-small cell lung cancer (NSCLC; determined using tumor proportion score [TPS]), head and neck squamous cell carcinoma (HNSCC), esophageal squamous cell carcinoma (ESCC), and triple-negative breast cancer (TNBC) [3,8]. Similarly, several indications for pembrolizumab approved by the European Medicines Agency are specifically for patients whose tumors express PD-L1 by a validated test [4].
PD-L1 IHC 22C3 pharmDx is a qualitative immunohistochemical (IHC) assay using the monoclonal anti-PD-L1 clone 22C3 to detect PD-L1 protein in formalin-fixed, paraffin-embedded (FFPE) tumor tissues and is designed for use on the Dako Autostainer Link 48 (Agilent) [8]. However, many pathology laboratories do not have access to the Autostainer Link 48 and therefore cannot use PD-L1 IHC 22C3 pharmDx to assess the PD-L1 tumor status of patients who may be candidates for treatment with pembrolizumab. These laboratories often assess PD-L1 status using 22C3 monoclonal antibody-based tests and laboratory-developed tests (LDTs) on the Ventana BenchMark XT or BenchMark ULTRA (Roche Diagnostics, Basel, Switzerland), both widely available IHC platforms. However, data on the performance of these LDTs compared with PD-L1 IHC 22C3 pharmDx are scant.
An analytical harmonization study that adapted the 22C3 monoclonal antibody to the Ventana BenchMark XT platform has been published [9]. In NSCLC samples scored by TPS (which measures PD-L1 in tumor cells only), PD-L1 assessments made using the LDT showed high concordance with those made using PD-L1 IHC 22C3 pharmDx [9]; the CPS scoring algorithm (PD-L1 in tumor cells and tumor-associated lymphocytes and macrophages) was not evaluated.
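The difference between the two scoring algorithms can be sketched in a few lines. The formulas below follow the commonly used definitions of TPS and CPS (both divide by the number of viable tumor cells, and CPS is capped at 100); the function and parameter names are illustrative and are not taken from this study.

```python
def tps(stained_tumor_cells, viable_tumor_cells):
    """Tumor proportion score: percentage of viable tumor cells that stain for PD-L1."""
    return 100 * stained_tumor_cells / viable_tumor_cells


def cps(stained_tumor_cells, stained_lymphocytes, stained_macrophages, viable_tumor_cells):
    """Combined positive score: PD-L1-staining tumor cells, lymphocytes, and
    macrophages per 100 viable tumor cells, capped at 100."""
    stained = stained_tumor_cells + stained_lymphocytes + stained_macrophages
    return min(100 * stained / viable_tumor_cells, 100)


# Hypothetical cell counts for one specimen (not data from this study).
print(tps(20, 200))          # tumor cells only
print(cps(20, 30, 10, 200))  # tumor cells plus immune cells
```

Because immune cells enter the numerator but not the denominator, the same specimen can have a CPS well above its TPS, which is why concordance shown with TPS does not automatically carry over to CPS.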
The aim of the current study was to establish the concordance between the 22C3 antibody-based LDT and the gold standard PD-L1 IHC 22C3 pharmDx using CPS in five tumor types separately (cervical cancer [CC], ESCC, HNSCC, TNBC, and urothelial carcinoma [UC]) and together in a pan-tumor analysis.

Tumor samples
Archival FFPE tumor blocks from patients with CC, ESCC, HNSCC, TNBC, and UC were sourced from various suppliers (Precision for Medicine, BioIVT, and MTgroup). Samples considered eligible for this study were <7 years old per technical specifications and had preserved …
… sharing website (available at: http://engagezone.msd.com/ds_documentation.php) outlines the process and requirements for submitting a data request. Applications will be promptly assessed for completeness and policy compliance. Feasible requests will be reviewed by a committee of MSD subject matter experts to assess the scientific validity of the request and the qualifications of the requestors. In line with data privacy legislation, submitters of approved requests must enter into a standard data-sharing agreement with MSD before data access is granted. Data will be made available for request after product approval in the United States and European Union or after product development is discontinued. There are circumstances that may prevent MSD from sharing requested data, including country or region-specific regulations. If the request is declined, it will be communicated to the investigator. Access to genetic or exploratory biomarker data requires a detailed, hypothesis-driven statistical analysis plan that is collaboratively developed by the requestor and MSD subject matter experts; after approval of the statistical analysis plan and execution of a data-sharing agreement, MSD will either perform the proposed analyses and share the results with the requestor or will construct biomarker covariates and add them to a file with clinical data that is uploaded to an analysis portal so that the requestor can perform the proposed analyses.

Immunohistochemistry
Archival FFPE tissue blocks were sectioned at a 4-μm thickness and attached to positively charged glass slides; slides were stored at room temperature. Staining was performed using consecutive serial sections ≤14 days after sectioning. PD-L1 IHC 22C3 pharmDx was performed on the Autostainer Link 48 platform following the manufacturer's specifications [8].
For the LDT, samples were stained with the 22C3 antibody (Agilent) diluted 1:33 in antibody diluent with casein using the BenchMark XT platform (Ventana). The LDT used for this analysis has been previously described [9]. Characteristics of the protocol using the 22C3 antibody on the BenchMark XT platform are summarized in Table 1. Briefly, the protocol used the ultraView kit (with amplification), and the following steps were performed: (1) specimens prepared in FFPE glass slides were selected (the "on" option on the machine was checked), (2) specimens were deparaffinized, (3) cell condition 1 was selected for 60 minutes, (4) primary antibody was incubated at 37 °C for 1 hour, (5) ultraView amplification was selected, and (6) specimens were counterstained with one drop of hematoxylin II for 4 minutes to provide very light nuclear details. Characteristics of the protocol using the 22C3 antibody on the BenchMark ULTRA platform were the same, with the exception of step 5, which included selecting amplification as well as mouse antibody amplification owing to the newer technology. This protocol was rigorously optimized and clinically tested as previously described [9]. The intensity and specificity were calibrated using the Agilent analytic controls and normal human tonsil tissue. Human tonsil tissue served as a benchmark and as on-slide controls for each clinical case.

Analysis
To minimize interobserver bias, all samples were reviewed, analyzed, and scored for PD-L1 CPS (score range, 0–100) by one trained pathologist (GWV) according to the same CPS clinical algorithm for PD-L1 IHC 22C3 pharmDx. First, the slides (for each indication) were stained using the Autostainer Link 48 platform and evaluated. Each slide was evaluated twice, with a washout period of 1 week. The final score was the average of both readings. Then, after a mandatory washout period of ≥3 weeks, the process was repeated for the LDT-stained slides, as before. The final BenchMark XT or BenchMark ULTRA platform-based LDT score was the average of both readings. Only after the reading was complete was the final score that was obtained using the LDT compared with that obtained using the gold standard (PD-L1 IHC 22C3 pharmDx). Scatterplots of CPS as a continuous variable were generated using GraphPad Prism 6 (GraphPad Software, La Jolla, CA). The intraclass correlation coefficient (ICC) and Spearman's correlation coefficient (ρ) were used to analyze the concordance of CPS by the 22C3 antibody-based LDT on the BenchMark XT platform versus PD-L1 IHC 22C3 pharmDx. Analyses were performed by each individual tumor type and in a pan-tumor analysis. For an additional 30 TNBC samples, ICC and Spearman's ρ were used to analyze the concordance of CPS by the 22C3 antibody-based LDT on the BenchMark ULTRA platform versus PD-L1 IHC 22C3 pharmDx.
To determine PD-L1 status in this study, each CPS was transformed to clinical groups using the following standard clinical CPS cutoffs indicated with PD-L1 IHC 22C3 pharmDx: CPS <1 (negative), CPS ≥1 (positive), and CPS ≥10 (positive) for CC samples; CPS <10 (negative) and CPS ≥10 (positive) for ESCC, TNBC, and UC samples; CPS <1 (negative), CPS ≥1 (positive), and CPS ≥20 (positive) for HNSCC samples [8,10]. After PD-L1 status was determined using the LDT and PD-L1 IHC 22C3 pharmDx, the agreement between both assays was characterized with overall percentage agreement (OPA), positive percentage agreement (PPA), and negative percentage agreement (NPA) using a contingency table. OPA was defined as the number of agreed samples divided by the total number of samples. PPA was defined as the number of positive samples identified by both LDT and PD-L1 IHC 22C3 pharmDx divided by the total number of positive samples by PD-L1 IHC 22C3 pharmDx. Similarly, NPA was defined as the number of negative samples identified by both LDT and PD-L1 IHC 22C3 pharmDx divided by the total number of negative samples by PD-L1 IHC 22C3 pharmDx. Notably, NPA was not calculated for cervical cancer samples at the CPS ≥1 cutoff because of the known and expected low number of PD-L1-negative cervical cancer samples; this sole constraint was decided before the analysis.
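The agreement definitions above translate directly into a small calculation. The sketch below uses the cutoffs listed in the text and treats PD-L1 IHC 22C3 pharmDx as the comparator; the function names, the dictionary layout, and the example scores are illustrative assumptions, not the study's analysis code.

```python
# CPS cutoffs per tumor type as listed in the text (CPS >= cutoff is positive).
CPS_CUTOFFS = {
    "CC": (1, 10),
    "ESCC": (10,),
    "HNSCC": (1, 20),
    "TNBC": (10,),
    "UC": (10,),
}


def agreement(ldt_scores, ref_scores, cutoff):
    """Return (OPA, PPA, NPA) as fractions, with ref_scores as the comparator.

    PPA or NPA is None when the comparator has no positive or no negative
    samples, because the denominator would be zero.
    """
    both_pos = both_neg = ref_pos = ref_neg = 0
    for ldt, ref in zip(ldt_scores, ref_scores):
        ldt_is_pos, ref_is_pos = ldt >= cutoff, ref >= cutoff
        if ref_is_pos:
            ref_pos += 1
            both_pos += ldt_is_pos
        else:
            ref_neg += 1
            both_neg += not ldt_is_pos
    opa = (both_pos + both_neg) / (ref_pos + ref_neg)
    ppa = both_pos / ref_pos if ref_pos else None
    npa = both_neg / ref_neg if ref_neg else None
    return opa, ppa, npa


# Hypothetical scores for six samples, evaluated at each cutoff for one tumor type.
ldt = [0, 5, 12, 40, 9, 15]
pharmdx = [1, 5, 10, 45, 11, 8]
for cutoff in CPS_CUTOFFS["TNBC"]:
    print(cutoff, agreement(ldt, pharmdx, cutoff))
```

Note that PPA and NPA are both conditioned on the comparator result, so they are not symmetric in the two assays; swapping the arguments generally changes them, while OPA stays the same.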

Analytic harmonization
pharmDx-positive control slides from Agilent were first used as analytic controls for both IHC platforms (Fig 1A). Because pathology laboratories without access to the Autostainer Link 48 platform cannot use this control, we also stained human tonsil tissue using PD-L1 IHC 22C3 pharmDx. Both resulted in a wide dynamic range of PD-L1 staining (Fig 1A and 1B). In the tonsil tissue samples, staining ranged from an intense dark-brown colored staining of the invaginated epithelium (indicative of strong PD-L1 expression; IHC score +3) to a light-brown colored staining of the germinal center macrophages (indicative of weaker PD-L1 expression; IHC score +1) (Fig 1C and 1D). Importantly, staining the Agilent analytic control slide with the 22C3 antibody-based LDT on the BenchMark XT platform resulted in a very similar staining pattern (Fig 1C and 1D), as did staining serial cuts of human tonsil tissue (Fig 1E and 1F). Notably, because of the wide availability of human tonsil tissue, it served as an on-slide positive control for all the slides in this study.

Triple-negative breast cancer
A total of 118 TNBC samples were analyzed with the BenchMark XT platform. The correlation for PD-L1 CPS showed an ICC of 0.88 (Spearman's ρ = 0.83) (Fig 2D). Almost all … (Table 2). An additional 30 samples were analyzed using the BenchMark ULTRA platform. The correlation for CPS showed an ICC of 0.94 (Spearman's ρ = 0.88) (Fig 2E). All samples were in agreement for CPS ≥10, for an OPA rate of 100% (Table 2).

Pan-tumor analysis
A total of 522 samples of various tumor types were acquired and included in the pan-tumor analysis. The correlation of PD-L1 CPS showed an ICC of 0.95 (Spearman's ρ = 0.93) (Fig 3).

Qualitative comparison
As with the analytic control, morphologic evaluation of stained tissue samples showed high similarity of PD-L1 staining patterns and intensities between PD-L1 IHC 22C3 pharmDx and the 22C3 antibody-based LDT on the BenchMark XT platform in CC, ESCC, HNSCC, TNBC, and UC samples (Fig 4).

Discussion and conclusions
Identifying patients who are most likely to benefit from a given therapy based on biomarkers is increasingly important in creating a personalized treatment plan in oncology. Although anti-PD-1/PD-L1 immunotherapy has improved the prognosis substantially across a spectrum of cancers, some patients do not respond to therapy [11]. Although PD-L1 IHC 22C3 pharmDx is a regulatory-approved companion diagnostic and a validated assay to facilitate patient selection for pembrolizumab, access to it is limited in some regions, particularly outside the United States, because its assay platform, the Dako Autostainer Link 48, is less widely available. Therefore, there is an urgent need for a validated LDT protocol that is easily adoptable using the widely available Ventana BenchMark XT or BenchMark ULTRA platforms and that produces reliable PD-L1 results.
Previous studies comparing validated PD-L1 IHC assays and PD-L1 LDTs based on different anti-PD-L1 antibodies have reported inconsistent and divergent results [12][13][14][15][16]. These inconsistencies may be partly attributed to several factors, including the following: small tumor sample size, choice of control tissue, potential variability in staining intensity of immune cells, variable staining intensity of immune/tumor cells for different antibody clones, and the potential subjective nature of PD-L1 scoring algorithms (i.e., clinically relevant cutoffs), which can give rise to interobserver variability in PD-L1 assessment by pathologists [12][13][14][15]. A recent meta-analysis assessing the diagnostic accuracy and interchangeability of PD-L1 IHC assays also suggested that when the testing laboratory is not able to use the regulatory-approved companion diagnostic for PD-L1 assessment for its specific purpose, use of a validated LDT developed for the same purpose as the original PD-L1 regulatory-approved companion diagnostic is better than switching to another PD-L1 regulatory-approved companion diagnostic developed for a different purpose [16].
Using both a 22C3 antibody-based LDT on the Ventana BenchMark XT platform previously validated for NSCLC and the PD-L1 CPS algorithm, we evaluated PD-L1 expression in tumor samples from patients with CC, ESCC, HNSCC, TNBC, and UC and compared it with assessments using the gold standard, the regulatory-approved PD-L1 IHC 22C3 pharmDx. In our study, CPS as a continuous variable demonstrated a high correlation between the 22C3 antibody-based LDT on the BenchMark XT platform and PD-L1 IHC 22C3 pharmDx across all assessed tumor types. A high correlation between the 22C3 antibody-based LDT on the BenchMark ULTRA platform and PD-L1 IHC 22C3 pharmDx was observed in a small set of TNBC samples. Furthermore, high concordance between the 22C3 antibody-based LDT on the BenchMark XT or BenchMark ULTRA platform and PD-L1 IHC 22C3 pharmDx was also reported for the clinical interpretations of PD-L1 status by tumor type. These findings suggest that this 22C3 antibody-based LDT has the potential to standardize PD-L1 scoring using the Ventana BenchMark XT or BenchMark ULTRA platforms, thereby serving an unmet need in pathology laboratories for PD-L1 testing.
Perhaps the best evidence of the equivalence of the two assays is the excellent correlation for CPS as a continuous variable (Figs 2 and 3). Although clinical interpretation of CPS ultimately leads to dichotomization of results around a cutoff to make a therapeutic decision, measures of concordance (NPA, PPA, and OPA) are influenced as much by the distribution of CPS values in proximity to the cutoff as by the analytic performances of the assays themselves. This agreement between two successive assays has been demonstrated by a mathematical model, which also showed that the clinical accuracy of dichotomized results can tolerate quite a bit of measurement uncertainty [17]. These findings are consistent with the notion that the LDT and PD-L1 IHC 22C3 pharmDx may be equally predictive of response to pembrolizumab, although they certainly do not offer proof. Of note, this is a limitation of most analytic concordance studies.
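The dependence of dichotomized agreement on the score distribution can be illustrated with a toy calculation. In this sketch the same small between-assay differences are applied to two hypothetical sample sets, one with scores far from a cutoff of 10 and one with scores clustered around it; none of the numbers come from this study, and the sketch is not the mathematical model cited above.

```python
def overall_agreement(ldt_scores, ref_scores, cutoff):
    """Fraction of samples given the same positive/negative call by both assays."""
    same_call = sum((a >= cutoff) == (b >= cutoff)
                    for a, b in zip(ldt_scores, ref_scores))
    return same_call / len(ref_scores)


CUTOFF = 10
# Identical measurement differences applied to both sample sets.
offsets = [+2, -2, +2, -2, +2, -2]

far_from_cutoff = [0, 2, 4, 40, 60, 80]   # hypothetical comparator scores
near_cutoff = [8, 9, 9, 10, 11, 11]       # hypothetical comparator scores

for name, ref in (("far", far_from_cutoff), ("near", near_cutoff)):
    ldt = [score + delta for score, delta in zip(ref, offsets)]
    print(name, overall_agreement(ldt, ref, CUTOFF))
```

With identical analytic differences, every call agrees when scores sit far from the cutoff, while most calls flip when scores cluster around it, which is why the continuous-score correlation is the more direct measure of analytic equivalence.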
Potential limitations of this study include the relatively small sample set sizes for CC (n = 77) and ESCC (n = 80) compared with those for TNBC (n = 118 [BenchMark XT]; n = 30 [BenchMark ULTRA]), UC (n = 121), and HNSCC (n = 126). However, the analytic comparability of the LDT used in this study was already successfully assessed in NSCLC (using TPS) and the analytic comparability in this study was successfully assessed in TNBC, UC, and HNSCC (using CPS). For this reason, we determined that an assessment of the smaller sample set sizes for CC and ESCC was sufficient because the goal was to ensure that the protocol did not fall outside of a reasonable confidence level. We did not report on the NPA for CC data because of the high prevalence of CPS ≥1 (positive) in the CC samples, which dictated that a much larger cohort would be needed for the NPA to be calculated. This study also did not consider interobserver variability, given that all samples were scored by a single experienced pathologist.
In conclusion, our 22C3 antibody-based LDT on the BenchMark XT or BenchMark ULTRA platforms demonstrates high concordance with PD-L1 IHC 22C3 pharmDx for the assessment of PD-L1 CPS across multiple tumor types, including CC, ESCC, HNSCC, TNBC, and UC. These findings suggest that the 22C3 antibody-based LDT on the BenchMark XT or BenchMark ULTRA platforms performs comparably with the gold-standard PD-L1 IHC 22C3 pharmDx for the assessment of PD-L1 status and can therefore be used to identify patients who are indicated for pembrolizumab therapy. Lastly, findings from our study will help laboratories that do not have access to the approved companion diagnostic assay for pembrolizumab (PD-L1 IHC 22C3 pharmDx and/or the Dako Autostainer platform) to adopt the validated protocol for the 22C3 antibody on a Ventana BenchMark XT platform (and Ventana BenchMark ULTRA platform for TNBC), enabling them to use existing platforms without requiring the purchase of new equipment.